Scalable Text Index Construction

Abstract

We survey recent advances in scalable text index construction, with a focus on practical algorithms for distributed, shared, and external memory.

Similar Resources

Scalable Construction of Text Indexes

The suffix array is the key to efficient solutions for a myriad of string processing problems in different application domains, such as data compression, data mining, and bioinformatics. With the rapid growth of available data, suffix array construction algorithms have had to be adapted to advanced computational models such as external memory and distributed computing. In this article, we present five s...

Scalable online index construction with multi-core CPUs

The inverted index is a core element of current text retrieval systems. It can be constructed dynamically using online indexing approaches in environments where even a small delay in timeliness cannot be tolerated and the index must always be queryable and up to date. Recently, efficient online index construction schemes have been proposed; however, previous works have not focused on scalabil...

Efficient single-pass index construction for text databases

Efficient construction of inverted indexes is essential to the provision of search over large collections of text data. In this article, we review the principal approaches to inversion, analyze their theoretical cost, and present experimental results. We identify the drawbacks of existing inversion approaches and propose a single-pass inversion method that, in contrast to previous approaches, does ...

Scalable Parallel Suffix Array Construction

Suffix arrays are a simple and powerful data structure for text processing that can be used for full-text indexes, data compression, and many other applications, in particular in bioinformatics. We describe the first implementation and experimental evaluation of a scalable parallel algorithm for suffix array construction. The implementation works on distributed-memory computers using MPI. Experi...

Scalable $k$-NN graph construction

The k-NN graph has played a central role in increasingly popular data-driven techniques for various learning and vision tasks; yet finding an efficient and effective way to construct k-NN graphs remains a challenge, especially for large-scale, high-dimensional data. In this paper, we propose a new approach to constructing approximate k-NN graphs, with an emphasis on efficiency and accuracy. We hierar...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2022

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-21534-6_14